Abstract

We give a basic overview of computational complexity, query complexity,
and communication complexity, with quantum information incorporated into
each of these scenarios. The aim is to provide simple but clear definitions,
and to highlight the interplay between the three scenarios and
currently-known quantum algorithms.

Complexity theory is concerned with the inherent cost required to solve
information processing problems, where the cost is measured in terms of various
well-defined resources. In this context, a problem can usually be thought of as
a function whose input is a problem instance and whose corresponding output
is the solution to it. Sometimes the solution is not unique, in which case the
problem can be thought of as a relation, rather than a function. Resources are
usually measured in terms of: some designated elementary operations, memory
usage, or communication. We consider three specific complexity scenarios,
which illustrate different advantages of working with quantum information:

1. Computational complexity

2. Query complexity

3. Communication complexity.



Despite the differences between these models, there are some intimate rela- 
tionships among them. The usefulness of many currently-known quantum al- 
gorithms is ultimately best expressed in the computational complexity model; 
however, virtually all of these algorithms evolved from algorithms in the query 
complexity model. The query complexity model is a natural setting for dis- 
covering interesting quantum algorithms, which frequently have interesting 
counterparts in the computational complexity model. Quantum algorithms 
in the query complexity model can also be transformed into protocols in the
communication complexity model that use quantum information (and sometimes
these are more efficient than any classical protocol can be). Also, this
latter relationship, taken in its contrapositive form, can be used to prove that 
some problems are inherently difficult in the query complexity model. 

1 Computational complexity 

In the computational complexity scenario, an input is encoded as a binary 
string (say) and supplied to an algorithm, which must compute an output 
string corresponding to the input. For example, in the case of the factoring 
problem, for input 100011 (representing 35 in binary), the valid outputs might 
be 000101 or 000111 (representing the factors of 35). The algorithm must 
produce the required output by a series of local operations. By this, we do not 
necessarily mean "local in space", but, rather, that each operation involves a 
small portion of the data. In other words, a local operation is a transformation 
that is confined to a small number of bits or qubits (such as two or three). 
The above property is satisfied by Turing machines and circuits, and also by
quantum Turing machines 0, 21] and quantum circuits [22, 52] (see also [39]).



We shall find it most convenient to work with circuit models here. 

1.1 Classical circuits 

For classical circuits, the basic operations can be taken as the binary
$\wedge$ (and) gate, the binary $\vee$ (or) gate, and the unary $\neg$ (not)
gate. Fig. 1 shows a boolean circuit consisting of five gates that computes
the parity of two bits. The inputs are denoted as $x_0$ and $x_1$, and the
"data-flow" is from left to right.

Figure 1: A classical circuit for computing the parity of two bits.

The rightmost gate is designated as the output, whose value is
$x_0 \oplus x_1$, as required. This is the smallest circuit consisting of
$\wedge$, $\vee$, and $\neg$ gates that computes the parity. Based on this
fact, we could say that the computational complexity of the binary parity
function is five. But note that this value is highly dependent on the specific
set of basic operations that we started with. If we included the binary
$\oplus$ (exclusive-or) gate as a basic operation then a single gate suffices
to compute the parity of two bits (Fig. 2).




Figure 2: An alternative circuit for parity with one exclusive-or gate. 

This illustrates a feature of the computational complexity model: the exact 
number of operations required to compute functions is quite sensitive to the 
technical choice of which basic operations to allow. The exact computational 
complexity of simple problems involving a small number of bits is somewhat 
arbitrary. 

Computational complexity is more meaningful when larger problems that
scale up are considered, such as the problem of computing the parity of $n$
bits, $x_0, x_1, \ldots, x_{n-1}$. Using $\oplus$ gates, one can construct a
tree with $n - 1$ such gates that computes this parity. On the other hand, if
only $\wedge$, $\vee$, and $\neg$ gates are available then it appears that
something like $5(n - 1)$ gates are needed. In both cases, the number of gates
is $O(n)$, and it is also straightforward to prove that a constant times $n$
gates are necessary for both cases. A similar property holds for any
computational complexity problem: changing from one set of gates to any other
set of gates (assuming that both sets are local and universal) can only affect
computational complexity by a multiplicative constant. Thus, for any
$f : \{0,1\}^* \to \{0,1\}$, its computational complexity is a well-defined
function (of $n$, the length of the input to $f$) up to a multiplicative
constant.
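As a small illustration of the gate counts above, the following Python sketch
builds the parity tree while expanding each exclusive-or into five
$\wedge$/$\vee$/$\neg$ gates, using the formula
$(a \wedge \neg b) \vee (\neg a \wedge b)$. The function names and the gate
counter are our own illustrative choices, and the five-gate expansion is an
assumption about how Fig. 1 is laid out.

```python
def xor_from_and_or_not(a, b, counter):
    """XOR built from five basic gates: (a AND NOT b) OR (NOT a AND b)."""
    counter["gates"] += 5  # two NOT gates, two AND gates, one OR gate
    return (a and not b) or (not a and b)


def parity_tree(bits, xor_gate):
    """Combine the bits pairwise, layer by layer, until one value remains."""
    layer = list(bits)
    while len(layer) > 1:
        nxt = [xor_gate(layer[i], layer[i + 1])
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:          # an unpaired bit moves up unchanged
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]


def count_parity_gates(bits):
    """Return (parity, number of AND/OR/NOT gates used by the tree)."""
    counter = {"gates": 0}
    value = parity_tree(bits, lambda a, b: xor_from_and_or_not(a, b, counter))
    return bool(value), counter["gates"]
```

For $n$ input bits the tree uses $n - 1$ exclusive-or nodes, so the counter
reports $5(n - 1)$ basic gates, in line with the estimate in the text.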

This is one reason why it is common to denote the computational complexity
of functions using asymptotic notation that suppresses multiplicative
constants. $O(T(n))$ means bounded above by $cT(n)$ for some constant $c > 0$
(for sufficiently large $n$). Also, $\Omega(T(n))$ means bounded below by
$cT(n)$ for some constant $c > 0$, and $\Theta(T(n))$ means both $O(T(n))$ and
$\Omega(T(n))$. A circuit is polynomially-bounded in size if its size is
$O(n^d)$ for some constant $d$.

A matter that we have so far obscured concerns the treatment of the pa- 
rameter n (denoting the input size). Although each circuit is for some fixed 
value of n, we are also speaking of n as a freely varying parameter. For prob- 
lems where n is a variable (such as the problem of computing the parity of 
n bits), an algorithm in the circuit model must actually be a circuit family 
$(C_1, C_2, C_3, \ldots)$, where circuit $C_n$ is responsible for all input
instances of size $n$. To be meaningful, a circuit family has to be uniform in
that it can somehow be finitely specified. For example, for the aforementioned
parity problem, a finite specification of a circuit family can be informally:
"for input size $n$, $C_n$ is a binary tree of $\oplus$-gates with
$x_0, \ldots, x_{n-1}$ at the leaves". Formally, a specification of a circuit
family is an algorithm that maps each $n$ to an explicit description of $C_n$.
Technically, we ought to include the efficiency of the specification
algorithm as part of the computational cost of a circuit family. This
raises the question of what formalism one uses to describe the specification 
algorithm. Note that if we try to use another circuit family for this then it 
requires its own specification algorithm (and so on!), so this approach will not 
work. There are sophisticated ways of dealing with uniformity; a very simple 
way is to just use some non-circuit model, such as a Turing machine (running 
in time, say, polynomial in n) for the circuit specification algorithm. At this 
point, the reader may wonder why one does not just use the Turing machine 
model to begin with. A big advantage of circuits is that their structural ele- 
ments are simple and easy to work with — and this appears to hold for quantum 
circuits as well. Uniformity tends to be a straightforward technicality, that 
can be worked out after a circuit family is discovered; the discovery of the 
circuit family is usually the interesting part of the algorithm design process. 

Let us now consider the problem of primality testing, where the input
is a number $x$ represented as an $n$-bit binary string, and the output is
(say) 1 if $x$ is prime and 0 if $x$ is composite. Notice that, in the cases
where $x$ is composite, there is no requirement here that a factor of $x$ be
produced. It turns out that the smallest currently-known uniform circuit
family for this problem has size $O(n^{d \log\log n})$ (for some constant $d$),
which is shy of being polynomially-bounded.

There exist probabilistic circuit families that solve primality testing more 
efficiently. A probabilistic circuit is one that can flip coins during its execution, 
and the evolution of the computation can depend on the outcomes. Formally, 
a coin-flip gate has no input and is understood to emit one
uniformly-distributed random bit when executed during a computation. If $m$
random bits are required then $m$ coin-flip gates can be inserted into a
circuit. Solovay and Strassen [49] discovered a remarkable probabilistic
algorithm for primality testing that can be expressed in terms of
probabilistic circuits. For any $\epsilon > 0$, there is a probabilistic
circuit of size $O(n^3 \log(1/\epsilon))$ that errs with probability at most
$\epsilon$. That is, given any $x \in \{0,1\}^n$ as input, the circuit
correctly decides the primality of $x$ with probability at least
$1 - \epsilon$. Note that the error probability is with respect to the
coin-flip gates, and not with respect to any assumed probability distribution
on the input $x$. The circuit family is highly uniform, and there are versions
of the algorithm that are quite efficient in practice, even when $\epsilon$
is very small (such as one billionth).
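To make the role of the coin flips concrete, here is a minimal Python sketch
of a Solovay-Strassen style test written as an ordinary program rather than
as a circuit. It is a textbook formulation and not necessarily the exact
procedure of [49]; the names `jacobi`, `solovay_strassen`, and the `rounds`
parameter are our own. Each round uses fresh random bits and a composite
input survives a round with probability at most 1/2, so roughly
$\log_2(1/\epsilon)$ rounds give error at most $\epsilon$.

```python
import random


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by the standard reciprocity loop."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:           # pull out factors of two
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                 # quadratic reciprocity swap
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def solovay_strassen(x, rounds=30, rng=random):
    """Return True if x is probably prime; never rejects a prime."""
    if x < 2:
        return False
    if x in (2, 3):
        return True
    if x % 2 == 0:
        return False
    for _ in range(rounds):
        a = rng.randrange(2, x - 1)             # random base in [2, x-2]
        j = jacobi(a, x)
        # Euler's criterion must agree with the Jacobi symbol for primes.
        if j == 0 or pow(a, (x - 1) // 2, x) != j % x:
            return False
    return True
```

The error is one-sided: a prime input is always accepted, and only the
random choices of `a` (not the input) determine whether a composite slips
through.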

As an aside, we note that probabilistic circuit families can be translated 
into standard (deterministic) circuit families if one is willing to forfeit
uniformity. For each $n$, by setting $\epsilon = 1/2^{n+1}$, we obtain a
probabilistic circuit $C_n$ of size $O(n^4)$ for primality testing that errs
with probability less than $1/2^n$ for any input. Now consider the circuit
$C'_n$ that results if, for each coin-flip gate in $C_n$, a uniformly
distributed random bit is independently generated and substituted for that
gate. This is a probabilistic construction that yields a deterministic
circuit $C'_n$. For $x \in \{0,1\}^n$, let $p_x$ be the probability that the
resulting $C'_n$ errs on input $x$. Then, for each $x$, $p_x < 1/2^n$, so the
probability that $C'_n$ errs for any $x \in \{0,1\}^n$ is strictly less than
$\sum_{x \in \{0,1\}^n} 1/2^n = 1$. Therefore, with probability greater than
0, $C'_n$ is correct for all of its $2^n$ possible input values. It follows
that, for any $n$, a deterministic circuit of size $O(n^4)$ must exist for
primality testing. The problem is that there is no known efficient way to
explicitly construct the coin flips which yield a correct circuit. Thus, the
implied $O(n^4)$-size circuit family for primality testing is merely
established by
an existence proof; this is an example of a non-uniform circuit family. The fact 
that uniform probabilistic circuit families can be converted into non-uniform 
deterministic circuit families is theoretically noteworthy, but not practical. 

A problem that is related to, but different from, primality testing is the
factoring problem, where the input is an $n$-bit number $x$, and the output is
a list of the prime factors of $x$. This is apparently much harder than
primality testing, since the smallest currently-known circuit family for this
problem is probabilistic and has size $O(2^{d\sqrt{n \log n}})$ (where $d$ is
a constant) p6| , 41], which is far from being polynomially-bounded. One of
the reasons why quantum algorithms are of interest is that there exists a
quantum circuit family of polynomial-size that solves the factoring problem
(this will be discussed later).

A problem that is closely related to the factoring problem is the
order-finding problem, where the input is a pair of natural numbers $a$ and
$N$ that are coprime (i.e. such that $\gcd(a, N) = 1$), and the goal is to
find the smallest positive $r$ such that $a^r \bmod N = 1$ (there always
exists such an $r \in \{1, \ldots, N - 1\}$). It turns out that a
polynomial-size circuit family for the
order-finding problem can be converted into a polynomial-size probabilistic 
circuit family for the factoring problem (and vice versa). In fact, the quantum 
circuit for factoring is actually obtained via this relationship from a quantum 
circuit that solves the order-finding problem. 
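The direction from order-finding to factoring can be sketched in a few lines
of Python. This is the standard textbook reduction, which we assume is the
one intended here; the names `order` and `factor_via_order` are ours, and the
brute-force `order` routine is only a stand-in (exponential in the bit
length) for whatever order-finding procedure is available. The sketch
assumes $N$ is an odd composite that is not a prime power.

```python
import random
from math import gcd


def order(a, N):
    """Smallest r >= 1 with a^r mod N == 1; assumes gcd(a, N) == 1."""
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r


def factor_via_order(N, rng=random, order_fn=order):
    """Return a nontrivial factor of N using calls to an order-finding routine."""
    while True:
        a = rng.randrange(2, N)
        d = gcd(a, N)
        if d > 1:
            return d                      # lucky guess: a shares a factor
        r = order_fn(a, N)
        if r % 2 == 0:
            y = pow(a, r // 2, N)         # y*y == 1 (mod N), y != 1
            if y != N - 1:
                # N divides (y-1)(y+1) but neither factor alone.
                return gcd(y - 1, N)
        # otherwise retry with a fresh random a
```

For example, with $N = 35$ and $a = 2$ the order is $r = 12$, and
$\gcd(2^6 - 1, 35) = \gcd(63, 35) = 7$.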

Although we have represented circuits pictorially as data-flow diagrams, 
it is useful to be able to encode circuits as binary strings. There are several 
reasonable encoding schemes. One such scheme encodes the graphical structure
of a circuit $C$ as an $m \times m$ adjacency matrix (where $m$ is the number
of gates plus the number of inputs in $C$), and then follows this by more bits
that specify the labels of the nodes (e.g. $\wedge$, $\vee$, $\neg$,
$x_0, \ldots, x_{n-1}$). Note that, using this encoding scheme, a circuit of
size $m$ has an encoding of $O(m^2)$ bits. There are more efficient encoding
schemes, where the encodings are of length $O(m \log m)$, and, for any
"reasonable" encoding scheme, the length of the string that encodes $C$ is
polynomially-related to the size of $C$. Let $e(C)$ denote a binary string
that encodes the circuit $C$ (relative to some reasonable encoding scheme).

A fundamental problem in classical computational complexity is the circuit
satisfiability problem, which is defined as follows. Call a circuit
satisfiable if there exists an input string to the circuit for which the
corresponding output value of the circuit is 1. For example, the circuit in
Fig. 1 is satisfiable. The input to the circuit satisfiability problem is a
binary string $x = e(C)$ that encodes some boolean circuit $C$, and the output
is 1 if $C$ is satisfiable, and 0 otherwise. The best currently-known
(deterministic or probabilistic) algorithm for circuit satisfiability is to
simply try all possible inputs to $C$. When $e(C)$ encodes a circuit $C$ with
$n$ inputs and $m$ gates, this procedure takes $O(2^n m^d)$ steps, where $d$
is a constant that depends on the encoding scheme used ($d = 2$ suffices for
most reasonable encoding schemes). In interesting cases, $m$ is typically
polynomial in $n$, so the dominant factor in this quantity is $2^n$. It is not
known whether or not there is a polynomially-bounded circuit family for
circuit satisfiability. In fact, circuit satisfiability is one of the
so-called NP-complete problems [19, 26], for which a polynomially-bounded
circuit family would yield polynomially-bounded circuits for all problems in
NP.
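The exhaustive procedure described above is easy to write down. The Python
sketch below uses a simple gate-list representation of our own choosing
(not the adjacency-matrix encoding $e(C)$ from the text): wires $0, \ldots,
n-1$ are the inputs, and each gate appends one new wire that later gates may
refer to. The last wire is the output.

```python
from itertools import product


def evaluate(circuit, inputs):
    """Evaluate a gate list on a tuple of boolean inputs; return the last wire."""
    wires = list(inputs)
    for op, *args in circuit:
        if op == "AND":
            wires.append(wires[args[0]] and wires[args[1]])
        elif op == "OR":
            wires.append(wires[args[0]] or wires[args[1]])
        elif op == "NOT":
            wires.append(not wires[args[0]])
        else:
            raise ValueError(f"unknown gate {op!r}")
    return wires[-1]


def satisfiable(circuit, n):
    """Try all 2**n inputs; each evaluation costs time linear in the gate count."""
    return any(evaluate(circuit, x) for x in product([False, True], repeat=n))


# Five-gate parity of two bits: (x0 AND NOT x1) OR (NOT x0 AND x1).
PARITY = [("NOT", 0), ("NOT", 1), ("AND", 0, 3), ("AND", 2, 1), ("OR", 4, 5)]
# x0 AND NOT x0 on a single input: never outputs 1.
CONTRADICTION = [("NOT", 0), ("AND", 0, 1)]
```

The loop over `product(...)` is the source of the $2^n$ factor; everything
else is polynomial in the size of the circuit.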

1.2 Quantum circuits 

To develop a theory of computational complexity for quantum information, 
it is natural to extend the notion of a circuit to a composition of gates which
perform quantum operations on quantum bits (called qubits). The most general 
quantum operations subsume all classical operations, which are frequently not 
reversible. It turns out that the quantum operations that seem to be the most 
useful in the context of quantum computation are those that are unitary (and 
hence reversible), as well as the von Neumann measurements. 

Let us begin by recalling that the state of a system of $m$ qubits can be
described by associating an amplitude $\alpha_x$ with each $x \in \{0,1\}^m$
(we restrict our attention to pure quantum states). Each amplitude is a
complex number and there is a condition that
$\sum_{x \in \{0,1\}^m} |\alpha_x|^2 = 1$. Taken together, these amplitudes
constitute a point in a $2^m$-dimensional vector space. The computational
basis associated with this vector space is
$\{|x\rangle : x \in \{0,1\}^m\}$, and we follow the convention of writing
states as linear combinations of these basis elements:

$$\sum_{x \in \{0,1\}^m} \alpha_x |x\rangle. \qquad (1)$$

Given a quantum state, it is impossible to access the values of the amplitudes
directly. What one can do is perform a (von Neumann) measurement on each
qubit. If such an operation is performed then the result is a random $m$-bit
string $y$, distributed as $\Pr[y = x] = |\alpha_x|^2$, for each
$x \in \{0,1\}^m$. After this measurement, the original quantum state is
destroyed. One can also perform a unitary operation on an $m$-qubit system,
which is a linear transformation $U$ for which $U U^{\dagger} = I$, where
$U^{\dagger}$ is the conjugate transpose of $U$. Such a unitary transformation
can be represented by a $2^m \times 2^m$ matrix and will, in general, affect
all of the $m$ qubits.

For the purposes of quantum computation, we restrict the basic operations
to local unitary transformations that only involve a small number (say, one
or two) of the qubits. A one-qubit unitary operation can be described by a
$2 \times 2$ unitary matrix $U$. In the case where $m = 1$, this $U$
transforms the state $\alpha|0\rangle + \beta|1\rangle$ to the state
$\alpha'|0\rangle + \beta'|1\rangle$, where

$$\begin{pmatrix} \alpha' \\ \beta' \end{pmatrix} = U \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \qquad (2)$$

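In order to define the semantics of applying a one-qubit gate in the context
of an $m$-qubit system for $m > 1$, we introduce a tensor product operation.
Suppose that an $m$-qubit system is in state
$\sum_{x \in \{0,1\}^m} \alpha_x |x\rangle$ and an $n$-qubit system is in
state $\sum_{y \in \{0,1\}^n} \beta_y |y\rangle$. Then the state of the
combined system (consisting of $m + n$ qubits) is defined to be the tensor
product of the states of the individual systems, which is

$$\Big(\sum_{x \in \{0,1\}^m} \alpha_x |x\rangle\Big)\Big(\sum_{y \in \{0,1\}^n} \beta_y |y\rangle\Big) = \sum_{x \in \{0,1\}^m,\; y \in \{0,1\}^n} \alpha_x \beta_y |xy\rangle. \qquad (3)$$

For example, $(\frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle)(\frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle) = \frac{1}{2}|00\rangle - \frac{1}{2}|01\rangle - \frac{1}{2}|10\rangle + \frac{1}{2}|11\rangle$.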
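The tensor product rule is simple enough to check by machine. In the Python
sketch below, a state is represented (our choice, for illustration only) as
a dictionary from bit strings to amplitudes, so concatenating keys and
multiplying amplitudes reproduces the worked example above.

```python
from itertools import product
from math import isclose, sqrt


def tensor(a, b):
    """Tensor product of two states given as {bitstring: amplitude} dicts."""
    return {x + y: ax * by
            for (x, ax), (y, by) in product(a.items(), b.items())}


def is_normalized(state):
    """Check that the squared magnitudes of the amplitudes sum to 1."""
    return isclose(sum(abs(v) ** 2 for v in state.values()), 1.0)


# The one-qubit state (|0> - |1>)/sqrt(2) used in the example.
minus = {"0": 1 / sqrt(2), "1": -1 / sqrt(2)}
state = tensor(minus, minus)
```

Here `state` has amplitude $1/2$ on `"00"` and `"11"` and $-1/2$ on `"01"`
and `"10"`, up to floating-point rounding.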

Now, applying a one-qubit $U$ to the $k$th qubit of an $m$-qubit system is
defined to be the unitary operation that maps each basis state

$$|x_0 \cdots x_{m-1}\rangle = |x_0 \cdots x_{k-2}\rangle\, |x_{k-1}\rangle\, |x_k \cdots x_{m-1}\rangle$$

to the state

$$|x_0 \cdots x_{k-2}\rangle\, (U |x_{k-1}\rangle)\, |x_k \cdots x_{m-1}\rangle$$

(for each $x \in \{0,1\}^m$). Note that, by linearity, this completely defines
a unitary operation on an $m$-qubit system.

For example, the one-qubit Hadamard gate corresponds to the matrix

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad (4)$$

and, when it is applied to the second qubit of a two-qubit system, the resulting
operation is

$$\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix} \qquad (5)$$

(with respect to the ordering of basis states $|00\rangle$, $|01\rangle$,
$|10\rangle$, $|11\rangle$). A quantum circuit corresponding to such an
operation is in Fig. 3, which denotes that the first (top) qubit is left
unaltered and $H$ is applied to the second qubit.

Figure 3: Quantum circuit applying a Hadamard gate to one of two qubits.
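The definition of applying a one-qubit gate to the $k$th qubit can be turned
directly into code. The Python sketch below stores a state as a list of
$2^m$ amplitudes indexed by the integer whose binary digits are
$x_0 \cdots x_{m-1}$ (leftmost qubit most significant), which is our own
convention chosen to match the basis ordering used for the $4 \times 4$
Hadamard matrix above; the function names are illustrative.

```python
from math import sqrt

H = [[1 / sqrt(2), 1 / sqrt(2)],
     [1 / sqrt(2), -1 / sqrt(2)]]


def apply_one_qubit_gate(U, k, state, m):
    """Apply the 2x2 matrix U to qubit k (1-based, as in the text) of m qubits."""
    shift = m - k                       # bit position of qubit k in the index
    new = [0j] * len(state)
    for idx, amp in enumerate(state):
        bit = (idx >> shift) & 1
        for out in (0, 1):
            # U|bit> = U[0][bit]|0> + U[1][bit]|1>; other qubits are untouched.
            target = idx ^ ((bit ^ out) << shift)
            new[target] += U[out][bit] * amp
    return new


def gate_matrix(U, k, m):
    """Full 2**m x 2**m matrix of U acting on qubit k, built column by column."""
    dim = 2 ** m
    columns = []
    for j in range(dim):
        basis = [0j] * dim
        basis[j] = 1
        columns.append(apply_one_qubit_gate(U, k, basis, m))
    return [[columns[j][i] for j in range(dim)] for i in range(dim)]
```

Calling `gate_matrix(H, 2, 2)` yields the block-diagonal matrix with two
copies of $H$ on the diagonal, while `gate_matrix(H, 1, 2)` interleaves the
entries instead.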

To construct nontrivial quantum circuits, it is necessary to include
two-qubit unitary operations. A simple but quite useful two-qubit operation is
the CONTROLLED-NOT gate (c-not, for short), which, for all
$x, y \in \{0,1\}$, transforms the basis state $|x\rangle |y\rangle$ to the
basis state $|x\rangle |y \oplus x\rangle$ (and this extends to arbitrary
quantum states by linearity). The notation for the c-not gate in quantum
circuits is indicated in Fig. 4 (it is also known as the "reversible
exclusive-or" gate).

Figure 4: Notation for the controlled-not (c-not) gate.

Note that the c-not gate corresponds to the unitary transformation

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \qquad (6)$$

The semantics of the c-not gate extends to the context of $m$-qubit systems
with $m > 2$ in a manner similar to that of the one-qubit gates.

For basis states $|x\rangle |y\rangle$, the effect of the c-not gate is
essentially the same as the classical two-bit gate that maps $(x, y)$ to
$(x, x \oplus y)$ (for all $x, y \in \{0,1\}$). This gate negates the second
bit conditional on the first bit being 1. For arbitrary quantum states, the
behavior of this gate is more subtle. For example, although the classical gate
never changes the value of its first "control" bit, the quantum gate sometimes
does: applying the c-not gate to the state
$(\frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle)(\frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle)$
yields the state
$(\frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle)(\frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle)$.
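This change to the control qubit can be verified by direct calculation. The
Python sketch below (with a dictionary-based state representation and helper
names of our own) applies the c-not rule
$|x\rangle|y\rangle \mapsto |x\rangle|y \oplus x\rangle$ term by term and
compares the result with the claimed product state.

```python
from math import sqrt


def product_state(a, b):
    """Two-qubit product state from one-qubit amplitude pairs (a0, a1), (b0, b1)."""
    return {f"{x}{y}": a[x] * b[y] for x in (0, 1) for y in (0, 1)}


def cnot(state):
    """Map each basis term |x>|y> to |x>|y XOR x>, keeping its amplitude."""
    out = {}
    for xy, amp in state.items():
        x, y = int(xy[0]), int(xy[1])
        key = f"{x}{y ^ x}"
        out[key] = out.get(key, 0) + amp
    return out


s = 1 / sqrt(2)
plus, minus = (s, s), (s, -s)

after = cnot(product_state(minus, minus))   # c-not applied to |->|->
expected = product_state(plus, minus)       # claimed result |+>|->
```

The two dictionaries agree entry by entry: the second ("target") qubit is
unchanged while the first ("control") qubit has moved from the minus state
to the plus state.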

A more general kind of two-qubit gate is the controlled-$U$ gate, where
$U$ is a $2 \times 2$ unitary matrix. This gate maps $|0\rangle |y\rangle$ to
$|0\rangle |y\rangle$ and $|1\rangle |y\rangle$ to $|1\rangle (U |y\rangle)$
(for all $y \in \{0,1\}$), and is denoted in Fig. 5.

Figure 5: Notation for a controlled-$U$ gate.

Note that the c-not gate is a special case of a controlled-$U$ gate with

$$U = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (7)$$

(and this $U$ itself is essentially a not gate).

Now, suppose that we want to compute the AND of two bits (i.e. take $x_0$ and
$x_1$ as input and produce $x_0 \wedge x_1$ as output) using only the one- and
two-qubit gates of the above form. This can be done in a manner that avoids
irreversible operations via the quantum circuit in Fig. 6, where $H$ is the
Hadamard gate (Eq. 4) and

$$V = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \qquad (8)$$

(where $i = \sqrt{-1}$).

Figure 6: Quantum circuit simulating a c$^2$-not (Toffoli) gate.

For any $x_0, x_1, y \in \{0,1\}$, setting the initial state of the qubits to
$|x_0\rangle |x_1\rangle |y\rangle$ and tracing through the execution of this
circuit reveals that the final state is
$|x_0\rangle |x_1\rangle |y \oplus (x_0 \wedge x_1)\rangle$. Thus, when $y$ is
initialized to 0, the final state of the third qubit is
$|x_0 \wedge x_1\rangle$ (and the explicit classical data, $x_0 \wedge x_1$,
can be extracted from this quantum state by a measurement). The three-qubit
operation that is simulated in Fig. 6 is a so-called Toffoli gate (also called
a CONTROLLED-CONTROLLED-not, or c$^2$-not for short). See || ^3|, ^J for
some similar constructions.
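The tracing argument can be carried out mechanically. Since the circuit
diagram is not reproduced here, the Python sketch below assumes one standard
arrangement of the gates: a Hadamard on the third qubit, then
controlled-$V$, c-not, controlled-$V^{\dagger}$, c-not, controlled-$V$, and a
final Hadamard. The exact gate order in Fig. 6 may differ, and the helper
names are ours. Qubits are numbered 0, 1, 2 from left to right.

```python
from math import sqrt

S = 1 / sqrt(2)
H = [[S, S], [S, -S]]
V = [[1, 0], [0, 1j]]
V_DAG = [[1, 0], [0, -1j]]
X = [[0, 1], [1, 0]]


def apply(U, target, state, control=None):
    """Apply 2x2 U to `target` of a 3-qubit state, optionally controlled."""
    new = [0j] * 8
    for idx, amp in enumerate(state):
        bits = [(idx >> (2 - q)) & 1 for q in range(3)]
        if control is not None and bits[control] == 0:
            new[idx] += amp             # control is 0: leave this term alone
            continue
        for out in (0, 1):
            b = bits.copy()
            b[target] = out
            new[b[0] * 4 + b[1] * 2 + b[2]] += U[out][bits[target]] * amp
    return new


def toffoli_circuit(state):
    """Assumed gate sequence simulating a Toffoli gate on three qubits."""
    state = apply(H, 2, state)
    state = apply(V, 2, state, control=1)
    state = apply(X, 1, state, control=0)       # c-not from qubit 0 to qubit 1
    state = apply(V_DAG, 2, state, control=1)
    state = apply(X, 1, state, control=0)
    state = apply(V, 2, state, control=0)
    state = apply(H, 2, state)
    return state
```

Between the two Hadamards the third qubit picks up the phase
$i^{x_1} (-i)^{x_0 \oplus x_1} i^{x_0} = (-1)^{x_0 x_1}$ when it is in state
$|1\rangle$, which is why the sandwich flips $y$ exactly when
$x_0 \wedge x_1 = 1$.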

For classical circuits, there are finite sets of gates which are universal in the 
sense that they can be used to simulate any other set of gates. For quantum 
circuits, the situation is different, since the set of all unitary operations is 
continuous, and hence uncountable — even when restricted to one-qubit gates. 
If one starts with any finite set of quantum gates then the set of all unitary 
operations that can be implemented is limited to some countable subset of all the
unitary operations. In spite of this, there are meaningful ways to capture the 
important features associated with universal sets of gates. 

First, it turns out that there are infinite sets consisting of one- and
two-qubit gates that are universal in the exact sense. For example, if the
c-not gate as well as all unitary one-qubit gates are available then any
$k$-qubit unitary operation can be simulated with $O(4^k k)$ such gates
[3, 30]. Therefore, the overhead is constant when switching between different
universal sets of local unitary gates (such as the set of all two-qubit gates
and the set of all three-qubit gates).

Moreover, there are finite sets of one- and two-qubit gates that are
universal in an approximate sense. The aforementioned one-qubit Hadamard gate
$H$ (Eq. 4) and the two-qubit controlled-$V$ gate (where $V$ is defined in
Eq. 8) are an example of such a set. The precise result is best stated as a
theorem.



Theorem 1 ([33, 48]) Let $B$ be any two-qubit gate and $\epsilon > 0$. Then
there exists a quantum circuit of size $O(\log^d(1/\epsilon))$ (where $d$ is a
constant) consisting of only $H$ and controlled-$V$ gates which computes a
unitary operation $B'$ that approximates $B$ in the following sense. There
exists a unit complex number $\lambda$ (i.e. with $|\lambda| = 1$) such that
$\|B - \lambda B'\|_2 < \epsilon$.

In the above theorem, $\|\cdot\|_2$ is the norm induced by Euclidean distance
and $\lambda$ is a "global phase factor" (which can be disregarded).
Consequently, if $B'$ is substituted for $B$ in some quantum circuit then the
final state $\sum_x \alpha'_x |x\rangle$ approximates the final state of the
original circuit $\sum_x \alpha_x |x\rangle$ in the sense that
$\sqrt{\sum_x |\alpha'_x - \alpha_x|^2} < \epsilon$. This implies that if the
final state is measured then the probability of any event among the possible
outcomes is affected by at most $\epsilon$. The proof of Theorem 1 exploits
the fact that the commutator of two unitary operators is not generally $I$
(the identity operator), but it can converge very quickly to $I$ (see
[33, 8] for details).

An example of another finite set of gates that is universal in the
approximate sense is: $H$, $W$, and c-not, where

$$W = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}. \qquad (9)$$

In fact, with $W$ and c-not gates, one can simulate a controlled-$V$ gate,
as shown in Fig. 7.

Figure 7: Simulation of a controlled-$V$ gate (note: $W^{\dagger} = W^7$).

As in the classical case, the measure of computational complexity for
quantum circuits is most interesting when large problems that scale up are
considered. Using sets of gates that are universal in the exact sense,
computational complexity can vary only by constant factors. On the other hand,
using sets of gates that are universal in the approximate sense, computational
complexity can vary by at most polylogarithmic factors: any circuit with $m$
gates can be simulated within accuracy $\epsilon$ by a circuit in terms of a
different set of basic operations with $O(m \log^d(m/\epsilon))$ gates. This
is accomplished by simulating each of the $m$ gates of the original circuit
within accuracy $\epsilon/m$, which results in a total accumulated error
bounded by $\epsilon$.

For example, consider the problem where the input is
$x_0, x_1, \ldots, x_{n-1}$ and the goal is to compute the conjunction
$x_0 \wedge x_1 \wedge \cdots \wedge x_{n-1}$. In terms of $H$, $W$, and c-not
gates, the computational complexity can be shown to be $\Theta(n)$, and, with
another set of approximately universal gates, the complexity may be different,
but it will remain between $\Omega(n)$ and $O(n \log^d(n/\epsilon))$ (where
$d$ is some constant and $\epsilon$ is the accuracy level required).

Since it seems inconceivable that it would ever be possible to physically 
implement quantum gates with perfect accuracy, the need to ultimately work 
with approximations of quantum gates is inevitable. Fortunately, due to the
unitarity of quantum operations, inaccuracies only scale up linearly with the 
number of gates involved in a circuit. And, if one employs quantum error- 
correcting codes and fault-tolerant techniques then even gates with constant 
inaccuracies (and that are subject to "decoherence" ) can in principle be em- 
ployed in arbitrarily large quantum circuits |l], 31, E5[ (see [42] for an extensive 
review) . 

For quantum circuit families, we must also consider the issue of uniformity: 
a legitimate quantum circuit family should be finitely specifiable in a
computationally efficient way. This can be defined as a straightforward
extension of the uniformity definitions for classical circuit families, where
the specification algorithm is classical and a finite set of gates that is
approximately universal (such as $H$, $W$, and c-not) is used. All quantum
algorithms proposed to date
can be expressed as circuit families that are uniform in this sense (see p9| for 
further comments). 

Notwithstanding the above issues, a convenient practice is to allow perfect 
universal sets of gates, bearing in mind that: (a) they can always be approxi- 
mated using any finite set of gates that is approximately universal with only 
a polylogarithmic penalty in the circuit size (even if the implementations of 
these gates are approximate); and (b) uniformity tends to be a straightforward 
technicality (at least with the quantum algorithms discovered so far). 

Perhaps the most remarkable quantum algorithm that has been discovered 
to date is the factoring algorithm, due to Peter Shor [44]. 



Theorem 2 ([44]) There exists a quantum circuit family of size
$O(n^2 \log^d(n/\epsilon))$ that solves the factoring problem within accuracy
$\epsilon$ (for some constant $d$).

Note that this circuit size is essentially exponentially smaller than the most
efficient known classical probabilistic circuit for factoring (whose size is
$O(2^{d\sqrt{n \log n}})$). The quantum factoring algorithm actually follows
from an algorithm for the order-finding problem, which in turn evolved from an
algorithm in the query complexity model (explained in the next section).

The above result shows that, based on our current state of knowledge, 
quantum algorithms may be exponentially more efficient than classical algo- 
rithms for some problems. The next result shows that the gain in computa- 
tional efficiency cannot exceed one exponential. 

Theorem 3 For any $S(n)$-qubit quantum circuit with $T(n)$ gates there is a
classical probabilistic circuit with
$O(2^{S(n)} T(n)^3 \log^2(1/\epsilon))$ gates¹ that simulates it within
accuracy $\epsilon$ in the following sense. After measuring the final state of
the quantum circuit, the probability of any event among the outcomes differs
from that of the classical circuit by at most $\epsilon$.

The idea behind the proof of Theorem 3 is to store the values of all
$2^{S(n)}$ amplitudes associated with an $S(n)$-qubit quantum system in
classical bits (to an appropriate level of precision). Then, for each of the
$T(n)$ gates, these amplitudes are updated to reflect the effect of the gate.
At the end, the square of the absolute value of each amplitude is computed and
the resulting probability distribution is sampled by using coin-flip gates. To
obtain the upper bound in Theorem 3, it suffices to store each amplitude with
$T(n) + \log(1/\epsilon)$ bits of precision, which requires
$O(2^{S(n)} (T(n) + \log(1/\epsilon)))$ bits in all. Since the effect of each
quantum gate corresponds to multiplying the amplitude vector by a sparse
$2^{S(n)} \times 2^{S(n)}$ matrix, this entails $O(2^{S(n)})$ arithmetic
operations, which can be implemented by
$O(2^{S(n)} (T(n) + \log(1/\epsilon))^2)$ bit operations. Thus, the total
number of classical gates is
$O(2^{S(n)} T(n) (T(n) + \log(1/\epsilon))^2) \subseteq O(2^{S(n)} T(n)^3 \log^2(1/\epsilon))$.
Also, the measurement process can be simulated with
$O(2^{S(n)} T(n)^2 \log^2(1/\epsilon))$ classical gates.

¹ The $T(n)^3 \log^2(1/\epsilon)$ factor can be replaced by a smaller but more
complicated expression.
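The simulation strategy in this proof sketch (store every amplitude, update
the whole vector once per gate, then sample) looks as follows in Python. This
is only an illustration: it uses floating-point complex numbers instead of
the fixed-precision arithmetic analysed above, restricts attention to
one-qubit gates with at most one control, and all names (`apply_gate`,
`simulate`, the gate-tuple format) are our own.

```python
import random
from math import sqrt

S = 1 / sqrt(2)
H = [[S, S], [S, -S]]
X = [[0, 1], [1, 0]]


def apply_gate(state, U, target, n, control=None):
    """One pass over all 2**n amplitudes for a (controlled) one-qubit gate."""
    new = [0j] * len(state)
    t = n - 1 - target                       # qubit 0 is the leftmost bit
    for idx, amp in enumerate(state):
        if amp == 0:
            continue
        if control is not None and not (idx >> (n - 1 - control)) & 1:
            new[idx] += amp
            continue
        bit = (idx >> t) & 1
        for out in (0, 1):
            new[idx ^ ((bit ^ out) << t)] += U[out][bit] * amp
    return new


def simulate(gates, n, rng=random):
    """Run gates (U, target, control) on |0...0>, then sample a measurement."""
    state = [0j] * (2 ** n)
    state[0] = 1
    for U, target, control in gates:
        state = apply_gate(state, U, target, n, control)
    probs = [abs(a) ** 2 for a in state]     # squared absolute values
    outcome = rng.choices(range(2 ** n), weights=probs)[0]
    return format(outcome, f"0{n}b"), probs
```

For instance, `simulate([(H, 0, None), (X, 1, 0)], 2)` applies a Hadamard to
the first qubit followed by a c-not, and returns either `"00"` or `"11"`
together with the full probability vector. The cost per gate is proportional
to $2^n$, which is the exponential factor in Theorem 3.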

A more refined argument than the one above can be used to show that 
an $S(n)$-qubit circuit with $T(n)$ gates can be simulated using space that is
polynomial in S(n) and T(n) (but still with an exponential number of oper- 
ations), and there are also more esoteric computational models that subsume 
the power of quantum circuit families |^5| . 

Regarding the circuit satisfiability problem, it is currently unknown whether 
or not there exists a polynomially-bounded quantum circuit family that solves 
it. What is known is that quantum algorithms can solve this problem quadrat- 
ically faster than the best currently-known classical algorithms for this prob- 
lem. 

Theorem 4 There exists a quantum circuit family of size
$O(\sqrt{2^n}\, \log(1/\epsilon)\, m^d)$ that solves the circuit
satisfiability problem within accuracy $\epsilon$ (for some constant $d$).
Here, $n$ and $m$ measure the size of the input instance: $n$ is the number of
inputs to circuit $C$ and $m$ is the number of gates of $C$.

Note how this compares with the best currently-known classical circuit
family for the circuit satisfiability problem, which has size $O(2^n m^d)$.
Both quantities are exponential, but $\sqrt{2^n}$ is nevertheless considerably
smaller than $2^n$ for large values of $n$. The quantum algorithm is a
consequence of a remarkable algorithm in the query complexity model that was
discovered by Lov Grover [^7]] (explained in the next section).

2 Query complexity 

This is an abstract scenario which can be thought of as a game, like "twenty 
questions" . The goal is to determine some information by asking as few ques- 
tions as possible. This differs from the computational complexity scenario in 
that the "input" is not presented as a binary string at the beginning of the 
computation. Rather, the input can be thought of as a "black box" computing a
function $f : S \to T$, and the basic operations are queries, in which the
algorithm specifies a $t$ from the domain of the function and receives the
value $f(t)$ in response.

A natural example is that of "polynomial interpolation", where $f$ is an
arbitrary polynomial of degree $d$

$$f(t) = c_0 + c_1 t + \cdots + c_d t^d \qquad (10)$$

and the goal is to determine the coefficients $c_0, c_1, \ldots, c_d$. It is
well known that $d + 1$ queries to $f$ are necessary and sufficient to
accomplish this.
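The sufficiency of $d + 1$ queries can be seen from Lagrange interpolation.
The Python sketch below treats the polynomial as a black box `query`,
evaluates it at the points $t = 0, 1, \ldots, d$ (an arbitrary choice of
ours; any $d + 1$ distinct points would do), and recovers the coefficients
exactly using rational arithmetic. The function name and interface are
illustrative.

```python
from fractions import Fraction


def interpolate(query, d):
    """Recover [c_0, ..., c_d] of a degree-d polynomial from d + 1 queries."""
    points = [(Fraction(t), Fraction(query(t))) for t in range(d + 1)]
    coeffs = [Fraction(0)] * (d + 1)
    for i, (ti, yi) in enumerate(points):
        # Build the numerator of the Lagrange basis polynomial for point i,
        # as a coefficient list (lowest degree first), and its denominator.
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j, (tj, _) in enumerate(points):
            if j == i:
                continue
            basis = [Fraction(0)] + basis          # multiply by t
            for k in range(len(basis) - 1):        # then subtract tj * (old)
                basis[k] -= tj * basis[k + 1]
            denom *= ti - tj
        for k, b in enumerate(basis):
            coeffs[k] += yi * b / denom
    return coeffs
```

Each call to `query` corresponds to one query in the model, so the routine
makes exactly $d + 1$ of them regardless of the coefficient values.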

In the classical case, an algorithm in this model can be represented by a
circuit consisting of gates from some standard universal set (e.g. $\wedge$,
$\vee$, $\neg$) plus additional gates to perform queries. For $f : S \to T$, an
$f$-query gate takes $t \in S$ as input and produces $f(t)$ as output. In this
scenario, since there are no input bits related to the problem instance (the
problem instance is embodied in $f$), the inputs to the circuit are all set to
constant values (such as 0).

In order to be able to adapt this model to settings involving quantum
information, we slightly modify the form of the query gates so that they
are reversible. For example, for $f : \{0,1\}^n \to \{0,1\}$, define a
reversible $f$-query gate as the mapping from $\{0,1\}^n \times \{0,1\}$ to
$\{0,1\}^n \times \{0,1\}$ that takes $(x, y)$ to $(x, y \oplus f(x))$ (for
$x \in \{0,1\}^n$ and $y \in \{0,1\}$). Note that, for classical algorithms,
reversible $f$-queries yield exactly the same information as the non-reversible
kind. Any circuit that makes reversible $f$-queries can be converted into one
that makes exactly the same number of non-reversible $f$-queries (and vice
versa). Henceforth, all queries will be assumed to be in reversible form.
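A minimal sketch of this conversion, assuming inputs $x$ are encoded as Python integers and $y$ as a single bit (the helper names are ours):

```python
def make_reversible(f):
    """Wrap f: {0,1}^n -> {0,1} as the reversible query (x, y) -> (x, y XOR f(x))."""
    def query(x, y):
        return x, y ^ f(x)
    return query

def f_from_reversible(query):
    """Recover an ordinary query from a reversible one by fixing y = 0."""
    return lambda x: query(x, 0)[1]
```

Applying the reversible query twice restores $(x, y)$, which is what makes it a permutation of the basis states, and each conversion uses one query of the other kind per call.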

In the quantum case, an $f$-query is a unitary transformation that permutes
the basis states according to the classical mapping determined by $f$ (in
reversible form). For example, for $f : \{0,1\}^n \to \{0,1\}$, an $f$-query
gate is the unitary transformation that maps $|x\rangle|y\rangle$ to
$|x\rangle|y \oplus f(x)\rangle$ (for all $x \in \{0,1\}^n$ and
$y \in \{0,1\}$). One way of denoting $f$-queries in both classical and quantum
circuits is shown in Fig. 8 (for the case where $f : \{0,1\}^2 \to \{0,1\}$).



Figure 8: Notation for an $f$-query gate when $f : \{0,1\}^2 \to \{0,1\}$.

The first instance where a quantum algorithm was proven to outperform any
classical algorithm was with Deutsch's problem [21], where
$f : \{0,1\} \to \{0,1\}$ and $f(t) = (c_0 + c_1 t) \bmod 2$, and the goal is to
determine the value of $c_1$ (note that $c_1 = f(0) \oplus f(1)$). A classical
circuit (in reversible form) that computes $c_1$ with two $f$-queries is shown
in Fig. 9.

Figure 9: Classical circuit for Deutsch's problem using two queries.

The inputs to the circuit are both initialized to 0, and the unary $\oplus$
operation between the two $f$-queries is a NOT gate. It is easy to see that the
final values of the two bits are 1 and $c_1$. It can also be shown that no
classical algorithm exists that computes $c_1$ with a single $f$-query (since it
is impossible to determine $f(0) \oplus f(1)$ from just $f(0)$ or $f(1)$ alone).

But the quantum circuit in Fig. 10 [18, ^] computes $c_1$ with a single
$f$-query gate.

Figure 10: Quantum circuit for Deutsch's problem using one query.

Here the initial state of the two-qubit system is $|0\rangle|1\rangle$ and its
final state is $(-1)^{c_0}|c_1\rangle|1\rangle$, which yields $c_1$ when the
first qubit is measured.
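The stated final state can be checked numerically. The sketch below assumes the usual form of the one-query circuit, namely a Hadamard gate on each qubit, then a single $f$-query, then a Hadamard gate on each qubit again. This layout is our assumption, since the figure itself is not reproduced here, and the function names are ours.

```python
def hadamard_both(state):
    """Apply a Hadamard gate to each of two qubits; amplitudes indexed by 2*a + b."""
    out = [0.0] * 4
    for a in (0, 1):
        for b in (0, 1):
            for a2 in (0, 1):
                for b2 in (0, 1):
                    sign = (-1) ** (a * a2 + b * b2)
                    out[2 * a2 + b2] += 0.5 * sign * state[2 * a + b]
    return out

def f_query(state, f):
    """The f-query |a>|b> -> |a>|b XOR f(a)> acting on a 2-qubit state vector."""
    out = [0.0] * 4
    for a in (0, 1):
        for b in (0, 1):
            out[2 * a + (b ^ f(a))] += state[2 * a + b]
    return out

def deutsch(c0, c1):
    """Return the final state vector for f(t) = (c0 + c1*t) mod 2."""
    f = lambda t: (c0 + c1 * t) % 2
    state = [0.0, 1.0, 0.0, 0.0]      # initial state |0>|1>
    state = hadamard_both(state)
    state = f_query(state, f)         # the single query
    return hadamard_both(state)
```

For every choice of $c_0$ and $c_1$ the only nonzero amplitude sits on the basis state $|c_1\rangle|1\rangle$ and has the value $(-1)^{c_0}$, in agreement with the text.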

Query complexity can be pinned down more precisely than computational
complexity in that the "number of $f$-queries" is not sensitive to arbitrary
technical conventions. So, it makes sense to consider the exact query complexity
of a problem independent of linear factors, and to say that the classical query
complexity of Deutsch's problem is two, whereas its quantum query complexity
is one.

Although the above advantage is small, there are generalizations of Deutsch's
problem for which the discrepancy between classical and quantum query
complexity is much larger. One of these is Simon's problem p3J, which is defined
as follows. For a function $f : \{0,1\}^n \to \{0,1\}^n$, define
$s \in \{0,1\}^n$ to be an XOR-mask of $f$ if: $f(x) = f(y)$ if and only if
$x \oplus y \in \{0^n, s\}$ (where $\oplus$ is defined over
$\{0,1\}^n \times \{0,1\}^n$ bitwise). When $s = 0^n$, $f$ is a bijection, and
when $s \neq 0^n$, $f$ is a two-to-one function with a special structure related
to $s$. In Simon's problem, $f : \{0,1\}^n \to \{0,1\}^n$ is promised to have an
XOR-mask $s \in \{0,1\}^n$, and the goal is to find $s$ by making queries to
$f$. In this case, an $f$-query is the mapping
$(x,y) \mapsto (x, y \oplus f(x))$ in the classical case and
$|x\rangle|y\rangle \mapsto |x\rangle|y \oplus f(x)\rangle$ in the quantum case
($x, y \in \{0,1\}^n$). Note that Deutsch's problem is the special case of
Simon's problem where $n = 1$ (the XOR-mask is $\neg c_1$ in this case).
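To make the promise concrete, here is a small classical sketch with strings in $\{0,1\}^n$ encoded as integers. The particular construction `min(x, x ^ s)` is just one convenient function with XOR-mask $s$ (our choice, not taken from the text), and the collision search is the naive classical strategy, not Simon's algorithm.

```python
def make_simon_function(s):
    """One example of an f with XOR-mask s: x and x XOR s share the value min(x, x XOR s)."""
    return lambda x: min(x, x ^ s)

def has_xor_mask(f, n, s):
    """Check the definition directly: f(x) = f(y) iff x XOR y is 0^n or s."""
    return all((f(x) == f(y)) == ((x ^ y) in (0, s))
               for x in range(2 ** n) for y in range(2 ** n))

def find_mask_by_collision(f, n):
    """Naive classical search: query inputs in order until two values collide."""
    seen = {}
    for count, x in enumerate(range(2 ** n), start=1):
        v = f(x)
        if v in seen:
            return seen[v] ^ x, count   # (mask, queries used)
        seen[v] = x
    return 0, 2 ** n                    # no collision: f is a bijection, s = 0^n
```

The same check confirms the remark about Deutsch's problem: for $n = 1$ the function $f(t) = (c_0 + c_1 t) \bmod 2$ has XOR-mask $1 - c_1$.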

It can be proven that any classical algorithm in the query model for Simon's
problem must make $\Omega(\sqrt{2^n}\log(1/\epsilon))$ queries to $f$, even for
probabilistic circuits with query gates that are permitted to err with
probability up to $\epsilon$. On the other hand, there is a simple quantum
circuit that solves this problem with only $O(n \log(1/\epsilon))$ queries to
$f$ (see [46] for the details). There is also a refinement to Simon's original
algorithm that makes a polynomial number of queries and solves Simon's problem
exactly ||Tl| .

Although the primary resource under consideration is the number of queries,
the number of auxiliary operations (i.e. the non-query gates) is also of
interest, and it is desirable to bound both quantities. For Simon's algorithm
the total number of gates is $O(n^2 \log(1/\epsilon))$.

Simon's problem demonstrates that, in the query complexity setting, there
are quantum algorithms that are exponentially more efficient than any classical
algorithm. Although the query complexity scenario is somewhat abstract, the
significance of algorithms in this model will become clear when we examine
the consequences of the order-finding problem in the query scenario,
which is defined as follows. Let $N$ be an $n$-bit integer and
$a \in \{1, \ldots, N-1\}$ be a number such that $\gcd(a, N) = 1$. In this
version of the order-finding problem, the function
$f_{a,N} : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^n \times \{0,1\}^n$ is
defined as

$$f_{a,N}(x,y) = (x, (a^x y) \bmod N). \tag{11}$$

This is invertible if $y$ is restricted to $\{0, \ldots, N-1\}$ (and can be
extended to be invertible over its full domain by defining
$f_{a,N}(x,y) = (x,y)$ for the case where $N \le y < 2^n$). The goal is to find
the minimum $r \in \{1, \ldots, N-1\}$ such that $a^r \bmod N = 1$ by making
queries to $f_{a,N}$ (in this case, $f_{a,N}$ is already in reversible form).
Although there is no polynomially-bounded classical circuit that solves this
problem, Shor [Q] discovered a quantum circuit that solves it with probability
$1 - \epsilon$ using only $O(\log(1/\epsilon))$ queries to $f_{a,N}$ and
$O(n^2 \log^d(n/\epsilon))$ auxiliary gates (for some constant $d$). Detailed
explanations of the algorithm can be found in several sources, including
[18, [3^, 44].
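The query function of Eq. 11 is easy to write down classically. The sketch below is illustrative only: the names `make_order_query` and `brute_force_order` are ours, and the brute-force loop takes time exponential in $n$ in the worst case, so it serves only to define the quantity that the quantum algorithm finds efficiently.

```python
from math import gcd

def make_order_query(a, N):
    """Return f_{a,N}(x, y) = (x, a^x * y mod N), extended by the identity for y >= N."""
    assert gcd(a, N) == 1
    def f(x, y):
        if y >= N:                    # extension keeping f invertible on all n-bit y
            return x, y
        return x, (pow(a, x, N) * y) % N
    return f

def brute_force_order(a, N):
    """Minimum r >= 1 with a^r mod N = 1, found by repeated multiplication."""
    r, v = 1, a % N
    while v != 1:
        v = (v * a) % N
        r += 1
    return r
```

For example, with $a = 7$ and $N = 15$ the successive powers are $7, 4, 13, 1$, so the order is 4, and for each fixed $x$ the map $y \mapsto f_{a,N}(x,y)$ permutes the 4-bit values.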

A significant property of the function $f_{a,N}$ is that there exists a
classical circuit of size $O(n^2 \log n \log\log n)$ that takes $N$ (an $n$-bit
number), $a \in \{1, \ldots, N-1\}$ (such that $\gcd(a, N) = 1$), and
$x, y \in \{0,1\}^n$ as input, and produces $f_{a,N}(x,y)$ as output. In other
words, given $a$ and $N$, one can efficiently simulate an $f_{a,N}$-query gate.
Moreover, this simulation can be implemented in terms of quantum gates, such as
NOT, C-NOT, and C$^2$-NOT (using techniques for reversible classical
computation ||). By doing this simulation for each $f_{a,N}$-query gate in the
quantum circuit for the order-finding problem, one obtains a quantum circuit of
size $O(n^2 \log^d(n/\epsilon))$ that takes $a$ and $N$ as input and produces
the minimum positive $r$ such that $a^r \bmod N = 1$ as output with probability
$1 - \epsilon$. Thus, the algorithm in the query complexity model yields an
algorithm in the computational complexity model for order-finding, and hence
also for factoring. This is a specific instance of the following general
result relating algorithms in the query complexity model to algorithms in the
computational complexity model.

Theorem 5 Suppose that a function $f_z : \{0,1\}^m \to \{0,1\}^k$ is associated
with each $z \in \{0,1\}^n$ (where $m$ and $k$ are functions of $z$), and that
the classical computational complexity of the function that maps $(z,x)$ to
$f_z(x)$ is bounded above by $R(n)$. Suppose also that there is a problem in the
query complexity model where some property $P(f_z)$ is to be determined in terms
of $f_z$-queries, and that there is a quantum circuit that solves this problem
using $S(n)$ queries to $f_z$ and $T(n)$ auxiliary operations. Then the quantum
computational complexity of the problem where the input is $z \in \{0,1\}^n$ and
the output is the value of the property $P(f_z)$ is $O(R(n)S(n) + T(n))$.

The circuit for the computational complexity problem is merely the circuit
for the query complexity problem with a circuit simulating each $f_z$-query gate
substituted for that $f_z$-query gate.

A simple problem that seems natural in the query scenario is the search
problem [27], where $f : \{0,1\}^n \to \{0,1\}$, and the goal is to find an
$x \in \{0,1\}^n$ such that $f(x) = 1$ (or to indicate that no such $x$ exists).
Any classical algorithm for this problem must make $\Omega(2^n)$ $f$-queries,
even if it is allowed to err with a constant probability. Lov Grover [27]
discovered a remarkable quantum algorithm that accomplishes this with
$O(\sqrt{2^n})$ queries (some detailed explanations of the algorithm are found
in [||, 27, 37]). Grover's result, with some later refinements
[g, [l(], 14, 37, |54| incorporated into it, is summarized as follows.



Theorem 6 ([27]) There is a quantum algorithm that solves the search problem
for $f : \{0,1\}^n \to \{0,1\}$ with $O(\sqrt{2^n}\log(1/\epsilon))$ queries to
$f$, and errs with probability at most $\epsilon$.
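The square-root behaviour can be observed in a small state-vector simulation. The sketch below is not the algorithm of Theorem 6 in full: it assumes exactly one $x$ with $f(x) = 1$, uses the standard phase form of the query (a sign flip on marked amplitudes), and omits the amplification that produces the $\log(1/\epsilon)$ factor. The function names are ours.

```python
import math

def grover_probabilities(f, n, iterations):
    """Simulate Grover iterations on all 2^n amplitudes; return measurement probabilities."""
    N = 2 ** n
    amp = [1 / math.sqrt(N)] * N          # uniform superposition over all x
    for _ in range(iterations):
        # One f-query in phase form: negate the amplitude of every x with f(x) = 1.
        amp = [-a if f(x) else a for x, a in enumerate(amp)]
        # Inversion about the mean (the non-query part of each iteration).
        mean = sum(amp) / N
        amp = [2 * mean - a for a in amp]
    return [a * a for a in amp]

def optimal_iterations(n):
    """About (pi/4) * sqrt(2^n) iterations, assuming a single marked x."""
    return math.floor(math.pi / 4 * math.sqrt(2 ** n))
```

With $n = 8$ this uses 12 queries, compared with the 256 possible inputs, and leaves nearly all of the probability on the marked input. The simulation itself touches every amplitude, so it says nothing about classical running time.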

The efficiency of the above algorithm has been shown to be optimal |], ||, 

Clearly, Grover's algorithm can also be used to solve the existential
search problem, where the goal is just to determine whether or not there
exists an $x \in \{0,1\}^n$ such that $f(x) = 1$ (a problem that also requires
$\Omega(2^n)$ queries in the classical case). Note the similarity between this
existential search problem and the circuit satisfiability problem. In fact,
using Theorem 5, this algorithm in the query model translates into the algorithm
for the circuit satisfiability problem that is claimed in Theorem 4. The input
is $e(C)$, an encoding of a circuit $C$ with $m$ gates and $n$ inputs that
computes a mapping $C : \{0,1\}^n \to \{0,1\}$, and the output should be 1 if
there exists an $x \in \{0,1\}^n$ such that $C(x) = 1$, and 0 otherwise. The
mapping that takes $(e(C),x)$ to $C(x)$ can be computed by a classical circuit
with $O(m^d)$ gates (where $d$ is a constant that depends on the encoding
scheme, and is usually small). Also, the algorithm in Theorem 6 makes
$O(\sqrt{2^n}\log(1/\epsilon)\,n)$ auxiliary operations. Therefore, applying
Theorem 5, one obtains a quantum circuit of size
$O(\sqrt{2^n}\log(1/\epsilon)\,m^d)$ for the circuit satisfiability problem.
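Written out, the substitution into Theorem 5 goes as follows. The last step assumes $n = O(m^d)$, which the text does not state explicitly; we add it here as an assumption (it holds, for instance, whenever every input of $C$ feeds at least one gate).

```latex
\begin{align*}
R &= O(m^d), \qquad
S = O\bigl(\sqrt{2^n}\,\log(1/\epsilon)\bigr), \qquad
T = O\bigl(\sqrt{2^n}\,\log(1/\epsilon)\,n\bigr), \\
RS + T &= O\bigl(\sqrt{2^n}\,\log(1/\epsilon)\,(m^d + n)\bigr)
        = O\bigl(\sqrt{2^n}\,\log(1/\epsilon)\,m^d\bigr).
\end{align*}
```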

Let us now consider some variations and extensions of the existential search
problem in the query model. We shall henceforth refer to the existential search
problem as OR, defined as

$$\mathrm{OR}(f) = (\exists x) f(x), \tag{12}$$

where $f : \{0,1\}^n \to \{0,1\}$ is accessed through $f$-queries. The name OR
seems natural since

$$\mathrm{OR}(f) = f(00\cdots0) \vee f(00\cdots1) \vee \cdots \vee f(11\cdots1). \tag{13}$$

Note that the complementary problem $\mathrm{AND}(f) = (\forall x) f(x)$ has
computational complexity somewhat similar to that of OR, since
$(\forall x) f(x) = \neg(\exists x)\neg f(x)$.

Some non-trivial extensions of OR and AND are the alternating quantifier
problems, such as OR-AND, where there are two alternating quantifiers:

$$\text{OR-AND}(f) = (\exists x_1)(\forall x_2) f(x_1, x_2). \tag{14}$$

Here, $f : \{0,1\}^{n_1} \times \{0,1\}^{n_2} \to \{0,1\}$, and $n_1, n_2$ are
implicit parameters satisfying $n_1 + n_2 = n$. By a suitable recursive
application of Grover's algorithm for OR, this problem can be solved with
$O(\sqrt{2^n n}\,\log(1/\epsilon))$ queries to $f$ [|1| (the extra factor of
$\sqrt{n}$ is to amplify the accuracy of the bottom level algorithm for AND).
In fact, one can extend the above to $k$ alternations of quantifiers:

$$\text{OR-AND-}\cdots\text{-}Q(f) = (\exists x_1)(\forall x_2)\cdots(\mathcal{Q} x_k) f(x_1, x_2, \ldots, x_k), \tag{15}$$

where $Q \in \{\mathrm{OR}, \mathrm{AND}\}$ and
$\mathcal{Q} \in \{\exists, \forall\}$, depending on whether $k$ is even or
odd, and $f : \{0,1\}^{n_1} \times \cdots \times \{0,1\}^{n_k} \to \{0,1\}$
with $n_1 + \cdots + n_k = n$. In this case, the recursive application of
Grover's technique makes $O(\sqrt{2^n n^{k-1}}\,\log(1/\epsilon))$ queries to
$f$ (see [13]; also |E3] for a related result).

For all of these variations of OR and AND, it can be shown that any
classical algorithm for one of these problems must make $\Omega(2^n)$ queries,
and the quantum algorithms for these problems are all nearly quadratically more
efficient than this in the sense that they make $O((2^n)^{1/2+\delta})$ queries,
for any $\delta > 0$ and $\epsilon > 0$. In fact, even if $k$, the number of
alternations of OR and AND, is set to $\delta n / (2\log n)$ (instead of being
held constant), the quantum algorithms make $O((2^n)^{1/2+\delta})$ queries. All
of these quantum algorithms also have counterparts for the corresponding
problems in the computational model, where the function is specified by an
encoding $e(C)$ of a circuit $C$.

Another problem that has a similar flavor to these problems is the parity
problem (in the query scenario), defined as

$$\mathrm{PARITY}(f) = \Bigl(\sum_{x \in \{0,1\}^n} f(x)\Bigr) \bmod 2. \tag{16}$$

It can be shown that any classical algorithm requires $\Omega(2^n)$ queries to
solve PARITY, and it is natural to ask whether quantum algorithms can be
quadratically more efficient, or even $O((2^n)^r)$, for some $r < 1$. One of
the applications of the communication complexity model (explained in the next
section) is to show that this is not possible: at least $\Omega(2^n/n)$ queries
must be made by any quantum algorithm. In fact, a stronger lower bound of
$\frac{1}{2}2^n$ is also known [||, |24| (using different methods).

It is important to note that, although upper bounds in the query model
translate into upper bounds in the computational model, the converse of this
need not be true. For example, it is conceivable that there is a
polynomially-bounded circuit that solves the circuit parity problem, where the
input is $e(C)$, an encoding of a circuit $C$ that computes a function $f$, and
the output is $\mathrm{PARITY}(f)$. Note how this latter problem is different
from another version of the parity problem in the computational scenario
(discussed in Section 1.1), where the inputs are $x_0, x_1, \ldots, x_{n-1}$,
and the goal is to compute $x_0 \oplus x_1 \oplus \cdots \oplus x_{n-1}$.

3 Communication complexity 

In this model, there are two parties, traditionally referred to as Alice and
Bob, who each receive an $n$-bit binary string as input
($x = x_0 x_1 \ldots x_{n-1}$ for Alice and $y = y_0 y_1 \ldots y_{n-1}$ for
Bob) and the goal is for them to determine the value of some function of these
$2n$ bits. The resource under consideration here is the communication between
the two parties, and an algorithm is a protocol, where the parties send
information to each other (possibly in both directions and over several rounds)
until one of them (say, Bob) obtains the answer. This model was introduced by
Yao [ET1 and has been widely studied in the classical context (see [35] for a
survey).

An interesting example is the equality problem, where the function is
EQ, defined as

$$\mathrm{EQ}(x,y) = \begin{cases} 1 & \text{if } x = y \\ 0 & \text{if } x \neq y. \end{cases}$$

A simple $n$-bit protocol for EQ is for Alice to just send her bits
$x_0, \ldots, x_{n-1}$ to Bob, after which Bob can evaluate the function by
himself (in fact, there is a similar $n$-bit protocol for any function). The
interesting question is whether or not the EQ function can be evaluated with
fewer than $n$ bits of communication; after all, the goal here is only for Bob
to acquire one bit. The answer depends on whether or not any error probability
is permitted.

If Bob must acquire the value of $\mathrm{EQ}(x,y)$ with certainty then it
turns out that $n$ bits of communication are necessary. Note that Alice sending
the first $n-1$ bits of $x$ will clearly not work, since the answer could
critically depend on whether or not $x_{n-1} = y_{n-1}$. The number of possible
protocols to consider is quite large and an actual proof that $n$ bits of
communication are necessary is nontrivial. The interested reader is referred to
|jjj] for a proof.

On the other hand, for probabilistic protocols (where Alice and Bob can
flip coins and base their behavior on the outcomes), if an error probability of
$\epsilon > 0$ is permitted then $O(\log(n)\log(1/\epsilon))$ bits of
communication are sufficient. As usual, we are not assuming anything about a
probability distribution on the input strings; the error probability is with
respect to the random choices made by Alice and Bob, and it applies regardless
of what $x$ and $y$ are.

We now describe an $O(\log(n)\log(1/\epsilon))$-bit protocol for EQ. First of
all, Alice and Bob agree on a finite field whose size is between $2n$ and $4n$
(such a field always exists, and its elements can be represented as
$O(\log(n))$-bit strings). Now, consider the two polynomials

$$p_x(t) = x_0 + x_1 t + \cdots + x_{n-1} t^{n-1} \tag{17}$$

$$p_y(t) = y_0 + y_1 t + \cdots + y_{n-1} t^{n-1}. \tag{18}$$

For any value of $t$ in the field, Alice can evaluate $p_x(t)$ and Bob can
evaluate $p_y(t)$. If $x = y$ then the two polynomials are identical, so
$p_x(t) = p_y(t)$ for every value of $t$. But, if $x \neq y$ then, since
$p_x(t)$ and $p_y(t)$ are distinct polynomials of degree at most $n-1$, there
can be at most $n-1$ distinct values of $t$ for which $p_x(t) = p_y(t)$.
Therefore, if a value of $t$ is chosen randomly from the field then the
probability that $p_x(t) = p_y(t)$ is at most $\frac{1}{2}$. Now, the protocol
proceeds as follows. Alice chooses $k = \log(1/\epsilon)$ independent random
elements of the field, $t_1, \ldots, t_k$, and then sends $t_1, \ldots, t_k$
and $p_x(t_1), \ldots, p_x(t_k)$ to Bob (this consists of
$O(\log(n)\log(1/\epsilon))$ bits). Then Bob outputs 1 if and only if
$p_x(t_i) = p_y(t_i)$ for all $i \in \{1, \ldots, k\}$. The probability that Bob
erroneously outputs 1 when $x \neq y$ is at most $1/2^k = \epsilon$.
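The protocol can be sketched in a few lines. As an assumption of ours, the field is taken to be the integers modulo a prime $p$ with $2n \le p \le 4n$ (the text only requires some finite field of that size), and the function names are hypothetical.

```python
import random

def field_prime(n):
    """Smallest prime p with 2n <= p <= 4n (one exists by Bertrand's postulate)."""
    for p in range(2 * n, 4 * n + 1):
        if p >= 2 and all(p % q for q in range(2, int(p ** 0.5) + 1)):
            return p
    raise ValueError("no prime found in [2n, 4n]")

def evaluate(bits, t, p):
    """Horner evaluation of bits[0] + bits[1]*t + ... + bits[n-1]*t^(n-1) mod p."""
    acc = 0
    for b in reversed(bits):
        acc = (acc * t + b) % p
    return acc

def eq_protocol(x, y, k, rng=random):
    """Alice sends k random points with her values; Bob outputs 1 iff all values agree."""
    p = field_prime(len(x))
    message = []
    for _ in range(k):                      # Alice's side
        t = rng.randrange(p)
        message.append((t, evaluate(x, t, p)))
    # Bob's side: compare Alice's values with his own polynomial at the same points.
    return int(all(evaluate(y, t, p) == v for t, v in message))
```

The message consists of $2k$ field elements, each of $O(\log n)$ bits. When $x = y$ the output is always 1; when $x \neq y$ each round catches the difference with probability at least $\frac{1}{2}$, so the error is one-sided.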

Two other interesting communication complexity problems are the intersection
problem, where the function is IN, defined as

$$\mathrm{IN}(x,y) = (x_0 \wedge y_0) \vee (x_1 \wedge y_1) \vee \cdots \vee (x_{n-1} \wedge y_{n-1}) \tag{19}$$

and the inner product problem, where the function is IP, defined as

$$\mathrm{IP}(x,y) = (x_0 \wedge y_0) \oplus (x_1 \wedge y_1) \oplus \cdots \oplus (x_{n-1} \wedge y_{n-1}). \tag{20}$$

Intuitively, for IN, the inputs $x$ and $y$ can be thought of as encodings of
two subsets of $\{0, \ldots, n-1\}$ and the output is a bit indicating whether
or not they intersect. Also, IP is the inner product of $x$ and $y$ as bit
vectors in modulo two arithmetic. The deterministic communication complexity of
each of these problems is the same as that of EQ: any deterministic protocol
requires $n$ bits of communication. Also, it has been shown that both of these
problems are more difficult than EQ when probabilistic protocols are
considered: any probabilistic protocol with error probability up to (say)
$\frac{1}{3}$ requires $\Omega(n)$ bits of communication (see [15] for IP, and
p{J for IN; also [35]).

It is natural to ask whether any reduction in communication can be obtained
by somehow using quantum information. Define a quantum communication
protocol as one where Alice and Bob can exchange messages that consist of
qubits. In a more formal definition of this model, there is an a priori system
of $m$ qubits, some of them in Alice's possession and some of them in Bob's
possession. The initial state of all of these qubits can be assumed to be
$|0\rangle$, and Alice and Bob can each perform unitary transformations on those
qubits that are in their possession and they can also send qubits between
themselves (thereby changing the ownership of qubits). The output is then taken
as the outcome of some measurement of Bob's qubits. Various preliminary
results about communication complexity with quantum information occurred
in n, m, m §|, H|



There are fundamental results in quantum information theory which imply
that classical information cannot be "compressed" within quantum information
p8| . For example, Alice cannot convey more than $r$ classical bits of
information to Bob by sending him an $r$-qubit message. Based on this, one
might mistakenly think that there is no advantage to using quantum information
in the communication complexity context. In fact, there exists a quantum
communication protocol that solves IN whose qubit communication is
approximately the square root of the bit communication of the best possible
classical probabilistic protocol.

Theorem 7 ([13]) There exists a quantum protocol for the intersection problem
(IN) that uses $O(\sqrt{n}\log(1/\epsilon)\log(n))$ qubits of communication and
errs with probability at most $\epsilon$.

Moreover, the quantum protocol can be adapted to actually find a point in
the intersection in the cases where $\mathrm{IN}(x,y) = 1$. That is, to produce
an $i \in \{0, \ldots, n-1\}$ such that $x_i \wedge y_i = 1$. This problem, like
IN, has classical probabilistic communication complexity $O(n)$.

To understand the protocol in Theorem 7, it is helpful to think of the inputs
$x$ and $y$ as functions rather than strings, and we introduce some notation
that makes this explicit. For convenience, assume that $n = 2^k$ for some $k$
(if not then $x$ and $y$ can be lengthened by padding them with zeroes), and
define the functions $f_x, f_y : \{0,1\}^k \to \{0,1\}$ as

$$f_x(i) = x_i \tag{21}$$

$$f_y(i) = y_i \tag{22}$$

where $\{0,1\}^k$ and $\{0, 1, \ldots, 2^k - 1\}$ are identified in the natural
way. Alice and Bob's input data can be thought of as $f_x$ and $f_y$, rather
than $x$ and $y$ (respectively). In particular, given $x$, Alice can simulate an
$f_x$-query that maps $|i\rangle|j\rangle$ to $|i\rangle|j \oplus f_x(i)\rangle$
(for all $i \in \{0,1\}^k$ and $j \in \{0,1\}$), and Bob can simulate
$f_y$-queries. (Although the resource that is of interest in this model is not
the number of basic operations that Alice and Bob perform, it is worth noting
that Alice and Bob's simulations of these queries can be explicitly implemented
by reversible circuits with $O(2^k k) = O(n\log(n))$ basic operations.)

To construct an efficient quantum protocol for IN, define the function
$f_x \wedge f_y : \{0,1\}^k \to \{0,1\}$ as
$(f_x \wedge f_y)(i) = f_x(i) \wedge f_y(i)$ (for $i \in \{0,1\}^k$), and
note that $\mathrm{IN}(x,y) = \mathrm{OR}(f_x \wedge f_y)$. Therefore, if Alice
and Bob can somehow perform $(f_x \wedge f_y)$-queries then the value of
$\mathrm{IN}(x,y)$ can be determined by making
$O(\sqrt{2^k}\log(1/\epsilon)) = O(\sqrt{n}\log(1/\epsilon))$ such queries. The
problem is that neither Alice nor Bob individually has enough information to
perform an $(f_x \wedge f_y)$-query (since this depends on both $x$ and $y$).
If Alice were to begin by sending $x$ to Bob then Bob could make
$(f_x \wedge f_y)$-queries on his own, but note that this entails $n$ bits of
communication to begin with. Another, more efficient, approach is for Alice and
Bob to collectively simulate $(f_x \wedge f_y)$-queries by combining
$f_x$-queries (which Alice can perform) with $f_y$-queries (which Bob can
perform), and a small amount of communication. To see how this is accomplished,
consider the circuit in Fig. 11. First, ignoring the broken vertical lines, note
that the quantum circuit (composed of two $f_x$-queries, two $f_y$-queries, and
one Toffoli gate) is equivalent to an $(f_x \wedge f_y)$-query. That is, it
implements the unitary transformation that maps the state
$|i\rangle|0\rangle|0\rangle|j\rangle$ to the state
$|i\rangle|0\rangle|0\rangle|j \oplus (f_x \wedge f_y)(i)\rangle$ (for all
$i \in \{0,1\}^k$, $j \in \{0,1\}$). This circuit uses two extra qubits that are
each initialized in state $|0\rangle$ and which incur no net change.

Figure 11: Simulation of an $(f_x \wedge f_y)$-query in terms of $f_x$-queries
and $f_y$-queries.
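Since every gate in this circuit permutes basis states, its action can be checked classically on each basis state. The sketch below assumes a particular wiring, with Bob's $f_y$-queries writing into one ancilla and Alice's $f_x$-queries into the other; this is our reading, as the figure is not reproduced here, and the function name is ours.

```python
def simulate_and_query(x, y, i, j):
    """Track the basis state |i>|a>|b>|j> through the five gates of the circuit."""
    a = b = 0            # the two extra qubits, initialized to |0>
    b ^= y[i]            # Bob: f_y-query writes y_i into ancilla b
    a ^= x[i]            # Alice: f_x-query writes x_i into ancilla a
    j ^= a & b           # Alice: Toffoli gate with controls a, b and target j
    a ^= x[i]            # Alice: second f_x-query resets ancilla a
    b ^= y[i]            # Bob: second f_y-query resets ancilla b
    return i, a, b, j
```

On every input the ancillas return to 0 and the target becomes $j \oplus (x_i \wedge y_i)$, which is the action of an $(f_x \wedge f_y)$-query; by linearity the same holds on superpositions.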

Now, the protocol for IN can be thought of as Bob executing the algorithm
in the query model for OR with the function $f_x \wedge f_y$, except that,
whenever an $(f_x \wedge f_y)$-query gate arises, he interacts with Alice to
simulate the circuit in Fig. 11: first Bob performs an $f_y$-query gate, then he
sends the $k + 3$ qubits to Alice who performs some actions involving
$f_x$-queries and a Toffoli gate (shown between the two broken lines) and sends
the qubits back to Bob, who performs another $f_y$-query. Note that the total
amount of communication that this entails is $2(k+3) \in O(\log n)$ qubits.
Therefore, the total communication for Bob's simulation of the
$O(\sqrt{n}\log(1/\epsilon))$ queries to $(f_x \wedge f_y)$ is
$O(\sqrt{n}\log(1/\epsilon)\log(n))$, as claimed in Theorem 7.

More recently, Ran Raz has given an example of a communication complexity
problem which a quantum protocol can solve with exponentially less
communication than the best classical probabilistic protocol. The description
of the problem is more complicated than EQ, IN, and IP, and the reader is
referred to [43] for the details.

The methodology used to establish Theorem 7 involved the conversion of
an algorithm in the query model (for OR) to a communication protocol (for
$\mathrm{IN}(x,y) = \mathrm{OR}(f_x \wedge f_y)$). This conversion can be stated
in a more general form.



Theorem 8 ([13]) Suppose that there is a quantum algorithm in the query
model that computes $P(f)$ in terms of $T(k,\epsilon)$ queries to $f$, where
$f : \{0,1\}^k \to \{0,1\}$, and $\epsilon$ is a bound on the error probability.
For $n = 2^k$, define the communication problem
$P^{\wedge} : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$ as
$P^{\wedge}(x,y) = P(f_x \wedge f_y)$. Then there is a quantum protocol that
solves $P^{\wedge}$ with $O(T(\log(n),\epsilon)\log(n))$ qubits of
communication. A similar result holds for $P^{\vee}(x,y) = P(f_x \vee f_y)$
and $P^{\oplus}(x,y) = P(f_x \oplus f_y)$.

We conclude with a discussion of the quantum communication complexity
of the inner product function IP. It has been shown [34] (see also [17]) that
even quantum protocols require communication $\Omega(n)$ for this problem, even
when the error probability is permitted to be as large as (say) $\frac{1}{3}$.
This fact, combined with Theorem 8 applied in its contrapositive form, can be
used to establish a lower bound for the parity problem in the query model
(defined in Eq. 16). The main observation is that
$\mathrm{IP}(x,y) = \mathrm{PARITY}(f_x \wedge f_y)$. Suppose that there is a
quantum algorithm that computes $\mathrm{PARITY}(f)$ for
$f : \{0,1\}^k \to \{0,1\}$ by making $T(k)$ $f$-queries (assume that the error
probability is bounded by $\frac{1}{3}$). Then, by Theorem 8, there exists a
quantum protocol that solves IP with $O(T(k)k)$ qubits of communication, where
$n = 2^k$ is the size of the input instance to IP. Since there is a lower bound
of $\Omega(n) = \Omega(2^k)$ for the communication complexity of IP, we must
have $T(k)k \in \Omega(2^k)$, which implies that $T(k) \in \Omega(2^k/k)$. This
is an easy way to get a "ball park" lower bound for the query complexity of
PARITY, whose exact value is known to be $\frac{1}{2}2^k$ by other methods
[||, 24].
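The two identities that drive these reductions, $\mathrm{IN}(x,y) = \mathrm{OR}(f_x \wedge f_y)$ and $\mathrm{IP}(x,y) = \mathrm{PARITY}(f_x \wedge f_y)$, can be checked exhaustively for small $n$. The sketch below represents $x$ and $y$ as lists of bits and the functions as Python callables; the helper names are ours.

```python
def IN(x, y):
    """Intersection: OR over positions of x_i AND y_i."""
    return int(any(a & b for a, b in zip(x, y)))

def IP(x, y):
    """Inner product modulo two."""
    return sum(a & b for a, b in zip(x, y)) % 2

def OR(f, k):
    """Existential search over all 2^k inputs of f."""
    return int(any(f(i) for i in range(2 ** k)))

def PARITY(f, k):
    """Sum of f over all 2^k inputs, modulo two."""
    return sum(f(i) for i in range(2 ** k)) % 2

def f_and(x, y):
    """The function (f_x AND f_y)(i) = x_i AND y_i."""
    return lambda i: x[i] & y[i]
```

Each evaluation of `f_and(x, y)` needs one bit from each party, which is why a query algorithm for $P$ turns into a protocol for $P(f_x \wedge f_y)$ at a cost of $O(\log n)$ qubits per query.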
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